12/03/2019

Motivation

Standard Approach




It is common to pool data or to consider problems on a region-by-region basis.


This can make statistical problems more tractable.

Climate example

Natural Resource Management Regions

CSIRO and Bureau of Meteorology, 2015. Climate change in Australia information for Australia's natural resource management regions: Technical report.

Post-processing example


Whan, Kirien, and Maurice Schmeits. "Comparing area probability forecasts of (extreme) local precipitation using parametric and machine learning statistical postprocessing methods." Monthly Weather Review 146.11 (2018): 3651-3673.

Question






How should we assign regions for the analysis of extremes?

Application

Create regions that are likely to experience similar impacts

Regionalisation

These regions can then inform our statistical analysis

Outline

1. Regionalisation

  • Clustering
  • Dependence of bivariate extremes
  • Practicalities
  • Classification

2. Visualise spatial dependence

  • Max-stable processes

3. Spatial post-processing

Regionalisation

Clustering Distance



Require: Notion of closeness between two locations


Want: Form clusters based on extremal dependence


Solution: The F-madogram distance





Bernard, Elsa, et al. "Clustering of maxima: Spatial dependencies among heavy rainfall in France." Journal of Climate 26.20 (2013): 7929-7937.

F-madogram distance

\[d(x_i, x_j) = \tfrac{1}{2} \mathbb{E} \left[ \left| F_i(M_{x_i}) - F_j(M_{x_j}) \right| \right]\] where \(M_{x_i}\) is the annual maximum rainfall at location \(x_i \in \mathbb{R}^2\) and \(F_i\) is the distribution function of \(M_{x_i}\).


Advantages:

  • Only use the raw block (annual) maxima
  • Requires no information about climate or topography
  • Non-parametric estimation (fast)


Cooley, D., Naveau, P. and Poncet, P., 2006. Variograms for spatial max-stable random fields. In Dependence in probability and statistics (pp. 373-390). Springer, New York, NY.
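A minimal sketch of a non-parametric estimate of the F-madogram distance, using only the standard library. It replaces each \(F_i\) with the empirical distribution function of the annual maxima at that station. The scaling by \(1/(n+1)\), the function names, and the assumption that both stations share the same complete set of years are illustrative choices, not details taken from the cited papers.

```python
from bisect import bisect_right


def ecdf_values(sample):
    """Empirical distribution function evaluated at each observation.

    Scaled by 1/(n + 1) so values stay strictly inside (0, 1)
    (an assumption made for this sketch).
    """
    ordered = sorted(sample)
    n = len(ordered)
    return [bisect_right(ordered, value) / (n + 1) for value in sample]


def fmadogram_distance(maxima_i, maxima_j):
    """Estimate d(x_i, x_j) = 0.5 * E|F_i(M_i) - F_j(M_j)| from paired annual maxima."""
    if len(maxima_i) != len(maxima_j):
        raise ValueError("both stations need maxima for the same years")
    u = ecdf_values(maxima_i)
    v = ecdf_values(maxima_j)
    return 0.5 * sum(abs(a - b) for a, b in zip(u, v)) / len(u)


def fmadogram_matrix(maxima_by_station):
    """Pairwise distance matrix (list of lists) for a list of stations."""
    n = len(maxima_by_station)
    return [
        [fmadogram_distance(maxima_by_station[i], maxima_by_station[j]) for j in range(n)]
        for i in range(n)
    ]
```

Two stations whose maxima always rank identically get a distance of 0. Because this is a sample estimate, values for weakly or negatively associated stations can exceed the theoretical upper bound.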

Extremal Coefficient

For \(M_{x_i}\) and \(M_{x_j}\) with common GEV marginals, the extremal coefficient \(\theta(x_i - x_j)\) is defined by \[\mathbb{P}\left( M_{x_i} \leq z, M_{x_j} \leq z \right) = \left[\mathbb{P}(M_{x_i}\leq z)\,\mathbb{P}(M_{x_j}\leq z) \right]^{\tfrac{1}{2}\theta(x_i - x_j)}.\]

The range of \(\theta(x_i - x_j)\) is \([1 , 2]\).

We can write the distance measure as a function of the extremal coefficient, \(\theta(x_i - x_j)\): \[d(x_i, x_j) = \dfrac{\theta(x_i - x_j) - 1}{2(\theta(x_i - x_j) + 1)}.\]

Therefore the range of \(d(x_i, x_j)\) is \([0 , 1/6]\).
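The relation above can be inverted to read an extremal coefficient off a distance: solving for \(\theta\) gives \(\theta = (1 + 2d)/(1 - 2d)\). A small sketch of both directions follows; the function names are illustrative.

```python
def fmadogram_from_theta(theta):
    """F-madogram distance implied by an extremal coefficient theta."""
    return (theta - 1) / (2 * (theta + 1))


def extremal_coefficient(d):
    """Extremal coefficient implied by an F-madogram distance d (requires d < 1/2)."""
    if not 0 <= d < 0.5:
        raise ValueError("d must lie in [0, 0.5)")
    return (1 + 2 * d) / (1 - 2 * d)
```

Here \(d = 0\) maps to \(\theta = 1\) (complete dependence) and \(d = 1/6\) maps to \(\theta = 2\) (independence).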

K-Medoids Clustering and PAM

  1. Randomly select an initial set of \(K\) stations. These form the initial set of medoids.
  2. Assign each station, \(x_i\), to its closest medoid, \(m_k\), based on the F-madogram distance.
  3. For each cluster, \(C_k\), update the medoid according to \[m_k = \mathop{\mathrm{argmin}}\limits_{x_i \in C_k} \sum_{x_j \in C_k} d(x_i, x_j).\]
  4. Repeat steps 2 and 3 until the medoids are no longer updated.


Kaufman, L. and Rousseeuw, P.J., 1990. Partitioning around medoids (PAM). Finding groups in data: an introduction to cluster analysis, pp.68-125.
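The listed steps can be sketched in plain Python as follows. This is the simple alternating scheme described in the steps, not the full build-and-swap PAM procedure of Kaufman and Rousseeuw. The function name, the `seed` and `max_iter` parameters, and the handling of empty clusters are assumptions made for the sketch. `dist` is any precomputed symmetric distance matrix, for example of F-madogram distances.

```python
import random


def k_medoids(dist, k, seed=0, max_iter=100):
    """Alternating K-medoids on a precomputed distance matrix (list of lists).

    Returns (medoids, labels): medoid indices and the cluster index of each point.
    """
    n = len(dist)
    rng = random.Random(seed)

    def assign(medoids):
        # Step 2: each point goes to its closest medoid.
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    # Step 1: random initial medoids.
    medoids = rng.sample(range(n), k)
    for _ in range(max_iter):
        labels = assign(medoids)
        # Step 3: within each cluster, pick the point minimising total distance.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty (sketch assumption).
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(
                min(members, key=lambda i: sum(dist[i][j] for j in members))
            )
        # Step 4: stop once the medoids no longer change.
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)
```

The result depends on the random initial medoids, so in practice it is worth running several seeds and comparing the total within-cluster distance.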

Result

Example

Consider \(\max \{ \| x_i - x_j \|, 2\}\) as the clustering distance.

Density example

Gridded data

Spatial density is changed by land-sea and domain boundaries



Tendency toward clusters of equal size



Clustering is in F-madogram space not Euclidean

Hierarchical Clustering

  1. Each station starts in its own cluster
  2. For each pair of clusters, \(C_k\) and \(C_{k'}\), define the distance between the clusters as \[d(C_k, C_{k'}) = \frac{1}{|C_k| |C_{k'}|} \sum_{x_k \in C_k} \sum_{x_{k'} \in C_{k'}} d(x_k, x_{k'}).\]
  3. Merge the two clusters with the smallest distance
  4. Update the distances relative to the new cluster
  5. Repeat steps 3 and 4 until all points are combined in a single cluster
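The between-cluster distance defined above is the average-linkage criterion. A naive sketch of the merging procedure in plain Python is given below, together with a helper that cuts the resulting tree at a chosen height. The function names and the merge-history format are assumptions for illustration; for real data a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` would be much faster.

```python
def average_linkage(dist):
    """Agglomerative clustering with average linkage on a distance matrix.

    Returns the merge history as a list of (members_a, members_b, height).
    """
    def between(a, b):
        # Mean pairwise distance between two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    clusters = {i: [i] for i in range(len(dist))}  # each point starts alone
    next_id = len(dist)
    merges = []
    while len(clusters) > 1:
        ids = list(clusters)
        pairs = ((p, q) for idx, p in enumerate(ids) for q in ids[idx + 1:])
        a, b = min(pairs, key=lambda pq: between(clusters[pq[0]], clusters[pq[1]]))
        merges.append((clusters[a], clusters[b], between(clusters[a], clusters[b])))
        # Replace the two merged clusters by their union.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


def cut_tree(merges, n_points, height):
    """Cluster labels obtained by applying only the merges at or below `height`."""
    labels = list(range(n_points))
    for left, right, h in merges:
        if h > height:
            break  # merge heights are non-decreasing for average linkage
        target = labels[left[0]]
        for i in right:
            labels[i] = target
    return labels
```

Cutting at a lower height gives more, smaller regions; cutting at a higher height gives fewer, larger ones.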

Hierarchical Clustering

Back to the first example

Classify

  • Classify a station relative to its closest neighbours
  • Use a weighted classification \(w\)-kNN
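A minimal sketch of a weighted \(k\)-nearest-neighbour classifier for assigning a location to a region. The slides do not specify the distance or the weights, so Euclidean distance on station coordinates and inverse-distance weights are assumptions here, as are the function name and the `k` and `eps` parameters.

```python
from collections import defaultdict
from math import dist


def weighted_knn(point, stations, labels, k=3, eps=1e-9):
    """Classify `point` by an inverse-distance-weighted vote of its k nearest stations.

    stations: list of coordinate tuples; labels: cluster label of each station.
    """
    nearest = sorted(zip(stations, labels), key=lambda sl: dist(point, sl[0]))[:k]
    votes = defaultdict(float)
    for coords, label in nearest:
        # Closer stations get larger weights; eps avoids division by zero.
        votes[label] += 1.0 / (dist(point, coords) + eps)
    return max(votes, key=votes.get)
```

For example, with stations labelled by the clustering step, `weighted_knn((0.5, 0.5), stations, labels)` returns the label whose nearby stations carry the most weight.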

Results

SHINY APP

Choosing a cut height

IMAGE

Similar Dependence

Where can we assume a common dependence structure for extremes?

Max-stable processes

Max-stable process

Shiny App

Level curves

Visualising Dependence

SWWA

TAS

Relevance to post-processing

Oesting et al. (2017)

  • approach

  • cut the region into two

Conclusions